Mike Middleton
https://orcid.org/0000-0001-5813-6347
Version 1.0, March 2024.
This research was begun in March 2022.
Colab Notebook: Live code (you must be logged into Google; if the page opens as raw code, select Google Colaboratory at the top of the screen)
HTML: Read only
Pre-processed data and images are available for download (without the need to run the code in these files) here:
https://github.com/MikeDairsie/Hillforts-Primer.
To review only confirmed hillforts (see Part 1: Status, Data Reliability), download data, save images or change the background image to show the topography, first save a copy of this document into your Google Drive folder. Once saved, set confirmed_only, download_data, save_images and/or show_topography to True in the code blocks below, save, and then select Runtime>Run all from the main menu above to rerun the code.

If selected, running the code will initiate the download and saving of files. Each document will download a number of data packages and you may be prompted to allow multiple downloads. Be patient; downloads may take a little time after the document has finished running. Note that each part of the Hillforts Primer is independent, so the download_data, save_images and show_topography variables must be enabled in each document where this functionality is required. Note also that saving images will activate the Google Drive folder, and you will be asked to allow access.

Selecting show_topography will change the background image to a colour topographic map. If set to True, this view will only show the distribution of the data selected; it will not show the overall distribution as a grey background layer, as is seen when using the simple coastal outlines.
confirmed_only = False
download_data = False
save_images = False
show_topography = False
# Percentage above which Appendix 02 plots will display
# minimum is 10%
show_percentage = 20
The initial sections of all the Hillforts Primer documents set up the coding environment and define functions used to plot, reprocess and save the data. If you would like to bypass the setup, please use the following link:
Go to Review Data Appendix 2.
The Atlas of Hillforts of Britain and Ireland data is made available under the licence, Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). This allows for redistribution, sharing and transformation of the data, as long as the results are credited and made available under the same licence conditions.
The data was downloaded from The Atlas of Hillforts of Britain and Ireland website as a CSV (comma-separated values) file and saved to the author’s GitHub repository, enabling the data to be used by this document.
Lock, G. and Ralston, I. 2017. Atlas of Hillforts of Britain and Ireland. [ONLINE] Available at: https://hillforts.arch.ox.ac.uk
Rest services: https://maps.arch.ox.ac.uk/server/rest/services/hillforts/Atlas_of_Hillforts/MapServer
Licence: https://creativecommons.org/licenses/by-sa/4.0/
Help: https://hillforts.arch.ox.ac.uk/assets/help.pdf
Data Structure: https://maps.arch.ox.ac.uk/assets/data.html
Hillforts: Britain, Ireland and the Nearer Continent (Sample): https://www.archaeopress.com/ArchaeopressShop/DMS/A72C523E8B6742ED97BA86470E747C69/9781789692266-sample.pdf
Map outlines made with Natural Earth. Free vector and raster map data @ naturalearthdata.com.
import sys
print(f'Python: {sys.version}')
import sklearn
print(f'Scikit-Learn: {sklearn.__version__}')
import pandas as pd
print(f'pandas: {pd.__version__}')
import numpy as np
print(f'numpy: {np.__version__}')
%matplotlib inline
import matplotlib
print(f'matplotlib: {matplotlib.__version__}')
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.patches as mpatches
from matplotlib.cbook import boxplot_stats
from matplotlib.lines import Line2D
import seaborn as sns
print(f'seaborn: {sns.__version__}')
sns.set(style="whitegrid")
import scipy
print(f'scipy: {scipy.__version__}')
from scipy import stats
from scipy.stats import gaussian_kde
import os
import collections
import math
import random
import PIL
import urllib
random.seed(42) # A random seed is used to ensure that the random numbers
# created are the same for each run of this document.
from slugify import slugify
# Import Google colab tools to access Drive
from google.colab import drive
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Scikit-Learn: 1.2.2
pandas: 1.5.3
numpy: 1.25.2
matplotlib: 3.7.1
seaborn: 0.13.1
scipy: 1.11.4
The following functions will be used to plot data later in the document.
def show_records(plt, plot_data):
    # Use white text over the dark topographic background, black otherwise.
    text_colour = 'w' if show_topography else 'k'
    plt.annotate(f'{len(plot_data)} records', xy=(-1180000, 6420000),
                 xycoords='data', ha='left', color=text_colour)
def get_backgrounds():
    if show_topography:
        backgrounds = ["hillforts-topo-01.png",
                       "hillforts-topo-north.png",
                       "hillforts-topo-northwest-plus.png",
                       "hillforts-topo-northwest-minus.png",
                       "hillforts-topo-northeast.png",
                       "hillforts-topo-south.png",
                       "hillforts-topo-south-plus.png",
                       "hillforts-topo-ireland.png",
                       "hillforts-topo-ireland-north.png",
                       "hillforts-topo-ireland-south.png"]
    else:
        backgrounds = ["hillforts-outline-01.png",
                       "hillforts-outline-north.png",
                       "hillforts-outline-northwest-plus.png",
                       "hillforts-outline-northwest-minus.png",
                       "hillforts-outline-northeast.png",
                       "hillforts-outline-south.png",
                       "hillforts-outline-south-plus.png",
                       "hillforts-outline-ireland.png",
                       "hillforts-outline-ireland-north.png",
                       "hillforts-outline-ireland-south.png"]
    return backgrounds
def get_bounds():
    bounds = [[-1200000, 220000, 6400000, 8700000],
              [-1200000, 220000, 7000000, 8700000],
              [-1200000, -480000, 7000000, 8200000],
              [-900000, -480000, 7100000, 8200000],
              [-520000, 0, 7000000, 8700000],
              [-800000, 220000, 6400000, 7100000],
              [-1200000, 220000, 6400000, 7100000],
              [-1200000, -600000, 6650000, 7450000],
              [-1200000, -600000, 7050000, 7450000],
              [-1200000, -600000, 6650000, 7080000]]
    return bounds
def show_background(plt, ax, location=""):
    backgrounds = get_backgrounds()
    bounds = get_bounds()
    folder = "https://raw.githubusercontent.com/MikeDairsie/Hillforts-Primer/main/hillforts-topo/"
    # Map a location code to the matching background image and bounds;
    # anything else falls back to the full map (index 0).
    locations = {"n": 1, "nw+": 2, "nw-": 3, "ne": 4, "s": 5,
                 "s+": 6, "i": 7, "in": 8, "is": 9}
    idx = locations.get(location, 0)
    background = os.path.join(folder, backgrounds[idx])
    img = np.array(PIL.Image.open(urllib.request.urlopen(background)))
    ax.imshow(img, extent=bounds[idx])
def get_counts(data):
    data_counts = []
    for col in data.columns:
        count = len(data[data[col] == 'Yes'])
        data_counts.append(count)
    return data_counts
def add_annotation_plot(ax):
    ax.annotate("Middleton, M. 2024, Hillforts Primer", size='small',
                color='grey', xy=(0.01, 0.01), xycoords='figure fraction',
                horizontalalignment='left')
    ax.annotate("Source Data: Lock & Ralston, 2017. hillforts.arch.ox.ac.uk",
                size='small', color='grey', xy=(0.99, 0.01),
                xycoords='figure fraction', horizontalalignment='right')

def add_annotation_l_xy(ax):
    ax.annotate("Middleton, M. 2024, Hillforts Primer", size='small',
                color='grey', xy=(0.01, 0.035), xycoords='figure fraction',
                horizontalalignment='left')
    ax.annotate("Source Data: Lock & Ralston, 2017. hillforts.arch.ox.ac.uk",
                size='small', color='grey', xy=(0.99, 0.035),
                xycoords='figure fraction', horizontalalignment='right')
def plot_bar_chart(data, split_pos, x_label, y_label, title):
    fig = plt.figure(figsize=(12, 5))
    ax = fig.add_axes([0, 0, 1, 1])
    # Trim the feature-name prefix (the first split_pos underscore-separated
    # parts) from each column label.
    x_data_new = ["_".join(x.split("_")[split_pos:]) for x in data.columns]
    y_data = get_counts(data)
    ax.bar(x_data_new, y_data)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    add_annotation_plot(ax)
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()

def plot_bar_chart_using_two_tables(x_data, y_data, x_label, y_label, title):
    fig = plt.figure(figsize=(12, 5))
    ax = fig.add_axes([0, 0, 1, 1])
    ax.bar(x_data, y_data)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    add_annotation_plot(ax)
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()

def plot_bar_chart_numeric(data, split_pos, x_label, y_label, title, n_bins):
    fig = plt.figure(figsize=(12, 5))
    ax = fig.add_axes([0, 0, 1, 1])
    data[x_label].plot(kind='hist', bins=n_bins)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    add_annotation_plot(ax)
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()

def plot_bar_chart_value_counts(data, x_label, y_label, title):
    fig = plt.figure(figsize=(12, 5))
    ax = fig.add_axes([0, 0, 1, 1])
    df = data.value_counts()
    x_data = df.index.values
    y_data = df.values
    ax.bar(x_data, y_data)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    add_annotation_plot(ax)
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()
def get_bins(data, bins_count):
    data_range = data.max() - data.min()
    if bins_count is not None:
        x_bins = [x for x in range(data.min(), data.max(), bins_count)]
        n_bins = len(x_bins)
    else:
        n_bins = int(data_range)
        if n_bins < 10:
            multi = 10
            while n_bins < 10:
                multi *= 10
                n_bins = int(data_range * multi)
        elif n_bins > 100:
            n_bins = int(data_range / 10)
    return n_bins
def plot_histogram(data, x_label, title, bins_count=None):
    n_bins = get_bins(data, bins_count)
    fig = plt.figure(figsize=(12, 5))
    ax = fig.add_axes([0, 0, 1, 1])
    ax.set_xlabel(x_label)
    ax.set_ylabel('Count')
    plt.ticklabel_format(style='plain')
    plt.hist(data, bins=n_bins)
    plt.title(get_print_title(title))
    add_annotation_plot(ax)
    save_fig(title)
    plt.show()
def plot_continuous(data, x_label, title):
    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_axes([0, 0, 1, 1])
    ax.set_xlabel(x_label)
    plt.plot(data, linewidth=4)
    plt.ticklabel_format(style='plain')
    plt.title(get_print_title(title))
    add_annotation_plot(ax)
    save_fig(title)
    plt.show()
# Box plot
def plot_data_range(data, feature, o="v"):
    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_axes([0, 0, 1, 1])
    ax.set_xlabel(feature)
    add_annotation_plot(ax)
    plt.title(get_print_title(feature + " Range"))
    plt.ticklabel_format(style='plain')
    orient = "v" if o == "v" else "h"
    sns.boxplot(data=data, orient=orient)
    save_fig(feature + " Range")
    plt.show()
    # Return the five-number summary from the box plot statistics.
    bp = boxplot_stats(data)
    low = bp[0].get('whislo')
    q1 = bp[0].get('q1')
    median = bp[0].get('med')
    q3 = bp[0].get('q3')
    high = bp[0].get('whishi')
    return [low, q1, median, q3, high]
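The five-number summary returned above comes from Matplotlib's boxplot_stats helper. A minimal self-contained check on toy values (not atlas data):

```python
from matplotlib.cbook import boxplot_stats

# boxplot_stats returns one dict per dataset, holding the whisker ends
# ('whislo'/'whishi'), quartiles ('q1'/'q3'), median ('med') and any
# outliers ('fliers').
bp = boxplot_stats([1, 2, 3, 4, 100])[0]
print([bp['whislo'], bp['q1'], bp['med'], bp['q3'], bp['whishi']])
print(bp['fliers'])  # the outlier 100 falls beyond the upper whisker
```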
def location_XY_plot():
    plt.ticklabel_format(style='plain')
    plt.xlim(-1200000, 220000)
    plt.ylim(6400000, 8700000)
    add_annotation_l_xy(plt)

def add_grey(region=''):
    if not show_topography:
        # Plots all the hillforts as a grey background.
        loc = location_data.copy()
        if region == 's':
            loc = loc[loc['Location_Y'] < 8000000].copy()
            loc = loc[loc['Location_X'] > -710000].copy()
        elif region == 'ne':
            loc = loc[loc['Location_Y'] < 8000000].copy()
            loc = loc[loc['Location_X'] > -800000].copy()
        plt.scatter(loc['Location_X'], loc['Location_Y'], c='Silver')
def plot_over_grey_numeric(merged_data, a_type, title, extra="", inner=False,
                           fringe=False, oxford=False, swindon=False):
    plot_data = merged_data
    fig, ax = plt.subplots(figsize=(14.2 * 0.66, 23.0 * 0.66))
    show_background(plt, ax)
    location_XY_plot()
    add_grey()
    patches = add_oxford_swindon(oxford, swindon)
    plt.scatter(plot_data['Location_X'], plot_data['Location_Y'], c='Red')
    if fringe:
        patches.append(add_21Ha_fringe())
    if inner:
        patches.append(add_21Ha_line())
    show_records(plt, plot_data)
    plt.legend(loc='upper left', handles=patches)
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()
def plot_over_grey_boundary(merged_data, a_type, boundary_type):
    plot_data = merged_data[merged_data[a_type] == boundary_type]
    fig, ax = plt.subplots(figsize=(9.47, 15.33))
    show_background(plt, ax)
    location_XY_plot()
    add_grey(region='')
    plt.scatter(plot_data['Location_X'], plot_data['Location_Y'], c='Red')
    show_records(plt, plot_data)
    plt.title(get_print_title('Boundary_Type: ' + boundary_type))
    save_fig('Boundary_Type_' + boundary_type)
    plt.show()
    print(f'{round((len(plot_data)/len(merged_data)*100), 2)}%')
def plot_density_over_grey(data, data_type):
    new_data = data.copy()
    new_data = new_data.drop(['Density'], axis=1)
    new_data = add_density(new_data)
    fig, ax = plt.subplots(figsize=((14.2 * 0.66) + 2.4, 23.0 * 0.66))
    show_background(plt, ax)
    location_XY_plot()
    add_grey()
    plt.scatter(new_data['Location_X'], new_data['Location_Y'],
                c=new_data['Density'], cmap=cm.rainbow, s=25)
    plt.colorbar(label='Density')
    plt.title(get_print_title(f'Density - {data_type}'))
    save_fig(f'Density_{data_type}')
    plt.show()
def add_21Ha_line():
    x_values = [-367969, -344171, -263690, -194654, -130542, -119597,
                -162994, -265052]  # , -304545]
    y_values = [7019842, 6944572, 6850593, 6779602, 6735058, 6710127,
                6684152, 6663609]  # , 6611780]
    plt.plot(x_values, y_values, 'k', ls='-', lw=15, alpha=0.25,
             label='≥ 21 Ha Line')
    add_to_legend = Line2D([0], [0], color='k', lw=15, alpha=0.25,
                           label='≥ 21 Ha Line')
    return add_to_legend

def add_21Ha_fringe():
    x_values = [-367969, -126771, 29679, -42657, -248650, -304545,
                -423647, -584307, -367969]
    y_values = [7019842, 6847138, 6671658, 6596650, 6554366, 6611780,
                6662041, 6752378, 7019842]
    plt.plot(x_values, y_values, 'k', ls=':', lw=5, alpha=0.45,
             label='≥ 21 Ha Fringe')
    add_to_legend = Line2D([0], [0], color='k', ls=':', lw=5, alpha=0.45,
                           label='≥ 21 Ha Fringe')
    return add_to_legend
def add_oxford_swindon(oxford=False, swindon=False):
    # Plots a circle over Swindon and/or Oxford.
    radius = 50
    marker_size = (2 * radius) ** 2
    patches = []
    if oxford:
        plt.scatter(-144362, 6758380, c='dodgerblue', s=marker_size, alpha=0.50)
        b_patch = mpatches.Patch(color='dodgerblue', label='Oxford orbit')
        patches.append(b_patch)
    if swindon:
        plt.scatter(-197416, 6721977, c='yellow', s=marker_size, alpha=0.50)
        y_patch = mpatches.Patch(color='yellow', label='Swindon orbit')
        patches.append(y_patch)
    return patches
def plot_over_grey(merged_data, a_type, yes_no, extra="",
                   inner=False, fringe=False, oxford=False, swindon=False):
    # Plots selected data over the grey dots. yes_no controls filtering
    # the data for positive or negative values.
    plot_data = merged_data[merged_data[a_type] == yes_no]
    fig, ax = plt.subplots(figsize=(14.2 * 0.66, 23.0 * 0.66))
    show_background(plt, ax)
    location_XY_plot()
    add_grey()
    patches = add_oxford_swindon(oxford, swindon)
    plt.scatter(plot_data['Location_X'], plot_data['Location_Y'], c='Red')
    if fringe:
        patches.append(add_21Ha_fringe())
    if inner:
        patches.append(add_21Ha_line())
    show_records(plt, plot_data)
    plt.legend(loc='upper left', handles=patches)
    plt.title(get_print_title(f'{a_type} {extra}'))
    save_fig(f'{a_type}_{extra}')
    plt.show()
    print(f'{round((len(plot_data)/len(merged_data)*100), 2)}%')
    return plot_data
def plot_type_values(data, data_type, title):
    new_data = data.copy()
    fig, ax = plt.subplots(figsize=((14.2 * 0.66) + 2.4, 23.0 * 0.66))
    show_background(plt, ax)
    location_XY_plot()
    plt.scatter(new_data['Location_X'], new_data['Location_Y'],
                c=new_data[data_type], cmap=cm.rainbow, s=25)
    plt.colorbar(label=data_type)
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()

def bespoke_plot(plt, title):
    add_annotation_plot(plt)
    plt.ticklabel_format(style='plain')
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()
def get_proportions(date_set):
    # The final entry (the 'Unknown' count) is excluded from the total.
    total = sum(date_set) - date_set[-1]
    newset = []
    for entry in date_set[:-1]:
        newset.append(round(entry / total, 2))
    return newset
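A quick self-contained check of the proportion logic above, with the function repeated inline so it runs on its own (toy counts, not atlas data):

```python
def get_proportions(date_set):
    # The final entry (the 'Unknown' count) is excluded from the total.
    total = sum(date_set) - date_set[-1]
    return [round(entry / total, 2) for entry in date_set[:-1]]

# Counts for two dated categories plus a trailing 'Unknown' count of 5:
# the proportions are taken over 2 + 3 only.
print(get_proportions([2, 3, 5]))  # [0.4, 0.6]
```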
def plot_dates_by_region(nw, ne, ni, si, s, features):
    fig = plt.figure(figsize=(12, 5))
    ax = fig.add_axes([0, 0, 1, 1])
    # Trim the 'Dating_Date_' prefix and drop the final 'Unknown' column.
    x_data_new = ["_".join(x.split("_")[2:]) for x in nw[features].columns[:-1]]
    set1_name = 'NW'
    set2_name = 'NE'
    set3_name = 'N Ireland'
    set4_name = 'S Ireland'
    set5_name = 'South'
    set1 = get_proportions(get_counts(nw[features]))
    set2 = get_proportions(get_counts(ne[features]))
    set3 = get_proportions(get_counts(ni[features]))
    set4 = get_proportions(get_counts(si[features]))
    set5 = get_proportions(get_counts(s[features]))
    X_axis = np.arange(len(x_data_new))
    budge = 0.25
    plt.bar(X_axis - 0.55 + budge, set1, 0.3, label=set1_name)
    plt.bar(X_axis - 0.4 + budge, set2, 0.3, label=set2_name)
    plt.bar(X_axis - 0.25 + budge, set3, 0.3, label=set3_name)
    plt.bar(X_axis - 0.1 + budge, set4, 0.3, label=set4_name)
    plt.bar(X_axis + 0.05 + budge, set5, 0.3, label=set5_name)
    plt.xticks(X_axis, x_data_new)
    plt.xlabel('Dating')
    plt.ylabel('Proportion of Total Dated Hillforts in Region')
    title = 'Proportions of Dated Hillforts by Region'
    plt.title(title)
    plt.legend()
    add_annotation_plot(ax)
    save_fig(title)
    plt.show()
The following functions will be used to confirm that features are not lost or forgotten when splitting the data.
def test_numeric(data):
    temp_data = data.copy()
    columns = data.columns
    out_cols = ['Feature', 'Entries', 'Numeric', 'Non-Numeric', 'Null']
    feat, ent, num, non, nul = [], [], [], [], []
    for col in columns:
        if temp_data[col].dtype == 'object':
            feat.append(col)
            temp_data[col+'_num'] = temp_data[col].str.isnumeric()
            entries = temp_data[col].notnull().sum()
            true_count = temp_data[col+'_num'][temp_data[col+'_num'] == True].sum()
            null_count = temp_data[col].isna().sum()
            ent.append(entries)
            num.append(true_count)
            non.append(entries - true_count)
            nul.append(null_count)
        else:
            print(f'{col} {temp_data[col].dtype}')
    summary = pd.DataFrame(list(zip(feat, ent, num, non, nul)))
    summary.columns = out_cols
    return summary
def find_duplicated(numeric_data, text_data, encodeable_data):
    d = False
    all_columns = (list(numeric_data.columns) + list(text_data.columns) +
                   list(encodeable_data.columns))
    duplicate = [item for item, count in collections.Counter(all_columns).items()
                 if count > 1]
    if duplicate:
        print(f"There are duplicate features: {duplicate}")
        d = True
    return d
def test_data_split(main_data, numeric_data, text_data, encodeable_data):
    m = False
    split_features = (list(numeric_data.columns) + list(text_data.columns) +
                      list(encodeable_data.columns))
    missing = list(set(main_data) - set(split_features))
    if missing:
        print(f"There are missing features: {missing}")
        m = True
    return m
def review_data_split(main_data, numeric_data, text_data,
                      encodeable_data=pd.DataFrame()):
    d = find_duplicated(numeric_data, text_data, encodeable_data)
    m = test_data_split(main_data, numeric_data, text_data, encodeable_data)
    if not d and not m:
        print("Data split good.")
def find_duplicates(data):
    # Counts the rows that appear more than once (keep=False flags every
    # member of each duplicated group).
    print(f'{data.duplicated(keep=False).sum()} duplicates.')
def count_yes(data):
    total = 0
    for col in data.columns:
        count = len(data[data[col] == 'Yes'])
        total += count
        print(f'{col}: {count}')
    print(f'Total yes count: {total}')
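A self-contained illustration of the check performed by test_data_split above, with toy column names and the set logic inlined:

```python
import pandas as pd

# Toy frames standing in for the real numeric/text split.
main_data = pd.DataFrame(columns=["a", "b", "c"])
numeric_data = pd.DataFrame(columns=["a"])
text_data = pd.DataFrame(columns=["b"])

# Iterating a DataFrame yields its column labels, so set(main_data)
# is the set of feature names in the main table.
split_features = list(numeric_data.columns) + list(text_data.columns)
missing = sorted(set(main_data) - set(split_features))
print(missing)  # ['c'] -- feature 'c' would be lost in the split
```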
The following functions will be used to update null values.
def fill_nan_with_minus_one(data, feature):
    new_data = data.copy()
    new_data[feature] = data[feature].fillna(-1)
    return new_data

def fill_nan_with_NA(data, feature):
    new_data = data.copy()
    new_data[feature] = data[feature].fillna("NA")
    return new_data
def test_numeric_value_in_feature(feature, value):
    test = feature.isin([value]).sum()
    return test

def test_categorical_value_in_feature(dataframe, feature, value):
    test = dataframe[feature][dataframe[feature] == value].count()
    return test

def test_cat_list_for_NA(dataframe, cat_list):
    for val in cat_list:
        print(val, test_categorical_value_in_feature(dataframe, val, 'NA'))

def test_num_list_for_minus_one(dataframe, num_list):
    for val in num_list:
        feature = dataframe[val]
        print(val, test_numeric_value_in_feature(feature, -1))
def update_cat_list_for_NA(dataframe, cat_list):
    new_data = dataframe.copy()
    for val in cat_list:
        new_data = fill_nan_with_NA(new_data, val)
    return new_data

def update_num_list_for_minus_one(dataframe, num_list):
    new_data = dataframe.copy()
    for val in num_list:
        new_data = fill_nan_with_minus_one(new_data, val)
    return new_data
def add_density(data):
    new_data = data.copy()
    xy = np.vstack([new_data['Location_X'], new_data['Location_Y']])
    new_data['Density'] = gaussian_kde(xy)(xy)
    return new_data
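add_density estimates a two-dimensional kernel density at each hillfort location with SciPy's gaussian_kde. A minimal self-contained sketch using random coordinates (toy data, not the atlas):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)
# 200 random (x, y) points stacked into the 2 x N shape gaussian_kde expects.
xy = np.vstack([rng.normal(size=200), rng.normal(size=200)])

# Fit the kernel on the points and evaluate it at those same points,
# giving one density value per point (as in add_density above).
density = gaussian_kde(xy)(xy)
print(density.shape)  # (200,)
```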
fig_no = 0
part = 'Appendix02'
IMAGES_PATH = r'/content/drive/My Drive/'
fig_list = pd.DataFrame(columns=['fig_no', 'file_name', 'title'])
topo_txt = ""
if show_topography:
    topo_txt = "-topo"
def get_file_name(title):
    file_name = slugify(title)
    return file_name

def get_print_title(title):
    title = title.replace("_", " ")
    title = title.replace("-", " ")
    title = title.replace(",", ";")
    return title

def format_figno(no):
    # Zero-pad the figure number to three digits, e.g. 7 -> '007'.
    return str(no).zfill(3)
if save_images:
    drive.mount('/content/drive')
    os.getcwd()
def save_fig(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    global fig_no
    global IMAGES_PATH
    if save_images:
        # Alternative save location:
        # IMAGES_PATH = r'/content/drive/My Drive/Colab Notebooks/Hillforts_Primer_Images/HP_Appendix_02_images/'
        fig_no += 1
        fig_no_txt = format_figno(fig_no)
        file_name = get_file_name(f'{part}_{fig_no_txt}')
        file_name = f'hillforts_primer_{file_name}{topo_txt}.{fig_extension}'
        fig_list.loc[len(fig_list)] = [fig_no, file_name, get_print_title(fig_id)]
        path = os.path.join(IMAGES_PATH, file_name)
        print("Saving figure", file_name)
        plt.tight_layout()
        plt.savefig(path, format=fig_extension, dpi=resolution,
                    bbox_inches='tight')
The source csv file is loaded and the first two rows are displayed to confirm the load was successful. Note that, to the left, an index has been added automatically. This index will be used frequently when splitting and remerging data extracts.
hillforts_csv = r"https://raw.githubusercontent.com/MikeDairsie/Hillforts-Primer/main/hillforts-atlas-source-data-csv/hillforts.csv"
hillforts_data = pd.read_csv(hillforts_csv, index_col=False)
pd.set_option('display.max_columns', None, 'display.max_rows', None)
hillforts_data.head(2)
<ipython-input-56-03f8272c1daf>:4: DtypeWarning: Columns (10,12,68,83,84,85,86,165,183) have mixed types. Specify dtype option on import or set low_memory=False.
  hillforts_data = pd.read_csv(hillforts_csv, index_col=False)
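The DtypeWarning above is benign here, but it can be silenced as the message suggests, either by reading the whole file before inferring types or by giving the mixed columns an explicit dtype. A sketch on a toy CSV (the column name is illustrative, not from the atlas):

```python
import io
import pandas as pd

# A column mixing numbers and text, like those flagged in the warning.
csv_text = "id,her_id\n1,910\n2,SO 53 SW 1\n"

# low_memory=False reads the file in one pass before inferring dtypes;
# an explicit dtype keeps the mixed column as text throughout.
df = pd.read_csv(io.StringIO(csv_text), low_memory=False,
                 dtype={"her_id": str})
print(df["her_id"].tolist())  # ['910', 'SO 53 SW 1']
```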
(Output: the first two records, EN0001 Aconbury Camp, Herefordshire and EN0002 Bach Camp, Herefordshire, displayed across the full set of columns, from OBJECTID and Main_Atlas_Number through to Record_URL.)
| No | No | No | No | No | No | No | No | No | No | No | No | Yes | No | None | No | NaN | Yes | 1.0 | NaN | No | NaN | Dorling, P. and Wigley, A. 2012. Assessment of... | https://hillforts.arch.ox.ac.uk/?query=Atlas_o... | http://www.wikidata.org/entity/Q31113996 | NaN | NaN | NaN | NaN | NaN | 1st Identified Map Depiction (1888); Other (20... | In-turned (South); Simple Gap (North west); Ho... | http://hillforts.arch.ox.ac.uk/records/EN0002.... |
If confirmed_only is set to True in User Settings above, this will filter the source data so that it contains only confirmed forts.
if confirmed_only:
    hillforts_data = hillforts_data[
        hillforts_data['Status_Interpretation_Reliability'] == 'Confirmed']
    print(f'Data filtered to contain only {len(hillforts_data)} confirmed hillforts.')
else:
    print(f'Using all {len(hillforts_data)} records in the Hillforts Atlas.')
Using all 4147 records in the Hillforts Atlas.
from google.colab import files
def download(data_list, filename, hf_data=hillforts_data):
    if download_data:
        # Prefix each download with the Name and Number reference features
        name_and_number = hf_data[['Main_Atlas_Number', 'Main_Display_Name']].copy()
        dl = name_and_number.copy()
        for pkg in data_list:
            if filename not in ['england', 'wales', 'scotland',
                                'republic-of-ireland', 'northern-ireland',
                                'isle-of-man', 'roi-ni', 'eng-wal-sco-iom']:
                if pkg.shape[0] == hillforts_data.shape[0]:
                    dl = pd.merge(dl, pkg, left_index=True, right_index=True)
            else:
                # Country packages are pre-built; download the package as-is
                dl = data_list[0]
        # Strip line breaks so each record exports as a single CSV row
        dl = dl.replace('\r', ' ', regex=True)
        dl = dl.replace('\n', ' ', regex=True)
        fn = get_file_name('hillforts_primer_' + filename)
        dl.to_csv(fn + '.csv', index=False)
        files.download(fn + '.csv')
The Main Atlas Number and the Main Display Name are the primary unique reference identifiers in the data. With these, users can identify any record numerically and by name. Throughout this document, the data will be clipped into a number of sub-data packages. Where needed, these data extracts will be combined with Name and Number features to ensure the data can be understood and can, if needed, be concorded.
name_and_number_features = ['Main_Atlas_Number','Main_Display_Name']
name_and_number = hillforts_data[name_and_number_features].copy()
name_and_number.head()
| | Main_Atlas_Number | Main_Display_Name |
|---|---|---|
| 0 | 1 | Aconbury Camp, Herefordshire (Aconbury Beacon) |
| 1 | 2 | Bach Camp, Herefordshire |
| 2 | 3 | Backbury Camp, Herefordshire (Ethelbert's Camp) |
| 3 | 4 | Brandon Camp, Herefordshire |
| 4 | 5 | British Camp, Herefordshire (Herefordshire Bea... |
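As a minimal sketch of this concordance, any data extract that shares the Atlas row index can be merged back onto the Name and Number features. The toy frames below reuse values from the first two records; the frame contents are illustrative only.

```python
import pandas as pd

# Name and Number reference features (values from the first two Atlas records)
name_and_number = pd.DataFrame({
    'Main_Atlas_Number': [1, 2],
    'Main_Display_Name': ['Aconbury Camp, Herefordshire (Aconbury Beacon)',
                          'Bach Camp, Herefordshire']})

# A clipped sub-data package sharing the same row index
extract = pd.DataFrame({'Landscape_Altitude': [276.0, 150.0]})

# Concord the extract back to its identifying features via the index
concorded = pd.merge(name_and_number, extract, left_index=True, right_index=True)
print(concorded.shape)  # (2, 3)
```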
location_numeric_data_short_features = ['Location_X','Location_Y']
location_numeric_data_short = hillforts_data[location_numeric_data_short_features]
location_numeric_data_short = add_density(location_numeric_data_short)
location_numeric_data_short.head()
location_data = location_numeric_data_short.copy()
location_data.head()
| | Location_X | Location_Y | Density |
|---|---|---|---|
| 0 | -303295 | 6798973 | 1.632859e-12 |
| 1 | -296646 | 6843289 | 1.540172e-12 |
| 2 | -289837 | 6808611 | 1.547729e-12 |
| 3 | -320850 | 6862993 | 1.670548e-12 |
| 4 | -261765 | 6810587 | 1.369981e-12 |
cluster_data = \
hillforts_data[['Location_X','Location_Y', 'Main_Country_Code']].copy()
cluster_data['Cluster'] = 'NA'
cluster_data['Cluster'].where(cluster_data['Main_Country_Code'] != \
'NI', 'I', inplace=True)
cluster_data['Cluster'].where(cluster_data['Main_Country_Code'] != \
'IR', 'I', inplace=True)
cluster_data['Cluster'] = np.where(
(cluster_data['Cluster'] == 'I') & (cluster_data['Location_Y'] >= 7060000) , \
'North Ireland', cluster_data['Cluster'])
north_ireland = cluster_data[cluster_data['Cluster'] == 'North Ireland'].copy()
cluster_data['Cluster'] = np.where(
(cluster_data['Cluster'] == 'I') & (cluster_data['Location_Y'] < 7060000) , \
'South Ireland', cluster_data['Cluster'])
south_ireland = cluster_data[cluster_data['Cluster'] == 'South Ireland'].copy()
cluster_data['Cluster'] = np.where(
(cluster_data['Cluster'] == 'NA') & (cluster_data['Location_Y'] < 7070000) , \
'South', cluster_data['Cluster'])
south = cluster_data[cluster_data['Cluster'] == 'South'].copy()
cluster_data['Cluster'] = np.where(
(cluster_data['Cluster'] == 'NA') & (cluster_data['Location_Y'] >= 7070000) & \
(cluster_data['Location_X'] >= -500000), 'Northeast', cluster_data['Cluster'])
north_east = cluster_data[cluster_data['Cluster'] == 'Northeast'].copy()
cluster_data['Cluster'] = np.where(
(cluster_data['Cluster'] == 'NA') & (cluster_data['Location_Y'] >= 7070000) & \
(cluster_data['Location_X'] < -500000), 'Northwest', cluster_data['Cluster'])
north_west = cluster_data[cluster_data['Cluster'] == 'Northwest'].copy()
temp_cluster_location_packages = \
[north_ireland, south_ireland, south, north_east, north_west]
cluster_packages = []
for pkg in temp_cluster_location_packages:
    pkg = pkg.drop(['Main_Country_Code'], axis=1)
    cluster_packages.append(pkg)
north_ireland, south_ireland, south, north_east, north_west = \
cluster_packages[0], cluster_packages[1], cluster_packages[2], \
cluster_packages[3], cluster_packages[4]
def renew_density(data):
    new_data = data.copy()
    # Remove any stale density columns before recalculating
    try:
        new_data = new_data.drop(['Density'], axis=1)
    except KeyError:
        pass
    try:
        new_data = new_data.drop(['Density_trans'], axis=1)
    except KeyError:
        pass
    xy = np.vstack([new_data['Location_X'], new_data['Location_Y']])
    new_data['Density'] = gaussian_kde(xy)(xy)
    new_data['Density_trans'] = stats.boxcox(new_data['Density'], 0.5)
    return new_data
north_west = renew_density(north_west)
def plot_north_west_dnesity_minus(show_eildon, cluster_north_west,
                                  location_X_cluster_north_west,
                                  location_Y_cluster_north_west):
    fig, ax = plt.subplots(figsize=((2.772 * 1.75) + 2.0, (7.26 * 1.75)))
    show_background(plt, ax, 'nw-')
    plt.ticklabel_format(style='plain')
    plt.xlim(-900000, -480000)
    plt.ylim(7100000, 8200000)
    plt.scatter(cluster_north_west['Location_X'],
                cluster_north_west['Location_Y'],
                c=cluster_north_west['Density_trans'], cmap=cm.rainbow, s=25)
    rect3, rect4, vline2, xline2 = get_rectangles(location_X_cluster_north_west,
                                                  location_Y_cluster_north_west,
                                                  3, 1.5)
    ax.add_patch(rect3)
    ax.add_patch(rect4)
    plt.vlines(x=[vline2[1]], ymin=vline2[0], ymax=vline2[2], colors='purple',
               ls='--', lw=1.5)
    plt.hlines(y=[xline2[1]], xmin=xline2[0], xmax=xline2[2], colors='purple',
               ls='--', lw=1.5)
    plt.plot(-609918, 7575512, 'ko')
    if show_eildon:
        ax.annotate('SC2466: Dunadd', xy=(-609918, 7575512), xycoords='data',
                    ha='left', xytext=(-110, -135), textcoords='offset points',
                    arrowprops=dict(arrowstyle="->", color='k', lw=1,
                                    connectionstyle="arc3,rad=-0.2"))
    plt.colorbar(label='Density Transformed')
    add_annotation_plot(plt)
    title = 'Density Trans & Boxplot (NW minus Ireland)'
    plt.title(get_print_title(title))
    save_fig(title)
    plt.show()
def plot_data_range_plus(data, feature, o=None):
    # Plot a boxplot only when an orientation ('v' or 'h') is given;
    # always return the boxplot statistics.
    if o is not None:
        fig = plt.figure(figsize=(12, 8))
        ax = fig.add_axes([0, 0, 1, 1])
        ax.set_xlabel(feature)
        add_annotation_plot(ax)
        plt.title(get_print_title(feature + " Range"))
        plt.ticklabel_format(style='plain')
        if o == "v":
            sns.boxplot(data=data, orient="v")
        else:
            sns.boxplot(data=data, orient="h")
        save_fig(feature + " Range")
        plt.show()
    bp = boxplot_stats(data)
    low = bp[0].get('whislo')
    q1 = bp[0].get('q1')
    median = bp[0].get('med')
    q3 = bp[0].get('q3')
    high = bp[0].get('whishi')
    return [low, q1, median, q3, high]
def get_rectangles(x_data, y_data, w1, w2):
    x_lo, x_q1, x_median, x_q3, x_hi = x_data
    y_lo, y_q1, y_median, y_q3, y_hi = y_data
    rect = patches.Rectangle((x_lo, y_lo), x_hi - x_lo, y_hi - y_lo, linewidth=w2,
                             edgecolor='k', facecolor='none')
    rect2 = patches.Rectangle((x_q1, y_q1), x_q3 - x_q1, y_q3 - y_q1, linewidth=w1,
                              edgecolor='r', facecolor='none')
    return rect, rect2, [y_q1, x_median, y_q3], [x_q1, y_median, x_q3]
def plot_north_west_density_for_column(show_eildon, regional_data, cluster,
                                       column, location_X_cluster,
                                       location_Y_cluster, show_boxes=False):
    fig, ax = plt.subplots(figsize=((2.772 * 1.75) + 2.0, (7.26 * 1.75)))
    show_background(plt, ax, 'nw-')
    plt.ticklabel_format(style='plain')
    plt.xlim(-900000, -480000)
    plt.ylim(7100000, 8200000)
    plt.scatter(cluster['Location_X'],
                cluster['Location_Y'],
                c=cluster['Density_trans'], cmap=cm.rainbow, s=25)
    if show_boxes:
        rect3, rect4, vline2, xline2 = get_rectangles(location_X_cluster,
                                                      location_Y_cluster,
                                                      3, 1.5)
        ax.add_patch(rect3)
        ax.add_patch(rect4)
        plt.vlines(x=[vline2[1]], ymin=vline2[0], ymax=vline2[2],
                   colors='purple', ls='--', lw=1.5)
        plt.hlines(y=[xline2[1]], xmin=xline2[0], xmax=xline2[2],
                   colors='purple', ls='--', lw=1.5)
    plt.plot(-609918, 7575512, 'ko')
    if show_eildon:
        ax.annotate('SC2466: Dunadd', xy=(-609918, 7575512), xycoords='data',
                    ha='left', xytext=(-110, -135), textcoords='offset points',
                    arrowprops=dict(arrowstyle="->", color='k', lw=1,
                                    connectionstyle="arc3,rad=-0.2"))
    plt.colorbar(label='Density Transformed')
    add_annotation_plot(plt)
    title = column
    text_colour = 'k'
    # Annotate the plot with the cluster size as a percentage and a count
    plt.annotate(str(round((len(cluster) / len(regional_data)) * 100, 1)) + "%",
                 xy=(-880000, 7150000), xycoords='data', ha='left',
                 color=text_colour)
    plt.annotate(str(len(cluster)) + '/' + str(len(regional_data)),
                 xy=(-880000, 7120000), xycoords='data', ha='left',
                 color=text_colour)
    plt.title(get_print_title(title))
    save_fig("NW_" + title)
    plt.show()
def plot_encodeable(hillforts_data, north_west, encodeable_features, show_percentage):
    encodeable_data = hillforts_data[encodeable_features].copy()
    location_data = pd.merge(north_west, encodeable_data,
                             left_index=True, right_index=True)
    print(len(location_data))
    if show_percentage < 10:
        show_percentage = 10
    for column in encodeable_features:
        # Filter data
        cluster = location_data[location_data[column] == "Yes"]
        show_eildon = False
        if (len(cluster) / len(location_data)) * 100 > show_percentage:
            # Refresh density for the filtered cluster
            cluster = renew_density(cluster)
            location_X_cluster = plot_data_range_plus(cluster['Location_X'], '', None)
            location_Y_cluster = plot_data_range_plus(cluster['Location_Y'], '', None)
            plot_north_west_density_for_column(show_eildon, location_data,
                                               cluster, column, location_X_cluster,
                                               location_Y_cluster, False)
def plot_altitude(hillforts_data, north_west, single_data_set, single_column,
                  altitude, show_percentage):
    # Filter data
    location_data = pd.merge(north_west, single_data_set,
                             left_index=True, right_index=True)
    print(len(location_data))
    new_col_name = "Landscape_Altitude" + "_" + altitude
    location_data.columns = ["Location_X", "Location_Y", "Cluster", "Density",
                             "Density_trans", "Location_X_y", "Location_Y_y",
                             "Density_y", new_col_name]
    cluster = location_data[location_data[new_col_name] == "Yes"]
    show_eildon = False
    if show_percentage < 10:
        show_percentage = 10
    if (len(cluster) / len(north_west)) * 100 > show_percentage:
        # Refresh density for the filtered cluster
        cluster = renew_density(cluster)
        location_X_cluster = plot_data_range_plus(cluster['Location_X'], '', None)
        location_Y_cluster = plot_data_range_plus(cluster['Location_Y'], '', None)
        plot_north_west_density_for_column(show_eildon, north_west,
                                           cluster, new_col_name,
                                           location_X_cluster,
                                           location_Y_cluster, False)
def plot_numeric(hillforts_data, north_west, numeric_features, show_percentage):
    numeric_data = hillforts_data[numeric_features].copy()
    location_data = pd.merge(north_west, numeric_data,
                             left_index=True, right_index=True)
    print(len(location_data))
    if show_percentage < 10:
        show_percentage = 10
    for column in numeric_features:
        unique_values = sorted(location_data[column].dropna().unique())
        print(unique_values)
        for val in unique_values:
            # Filter data
            cluster = location_data[location_data[column] == val]
            new_col_name = column + "_" + str(val)
            cluster = cluster.rename(columns={column: new_col_name})
            show_eildon = False
            if (len(cluster) / len(location_data)) * 100 > show_percentage:
                # Refresh density for the filtered cluster
                cluster = renew_density(cluster)
                location_X_cluster = plot_data_range_plus(cluster['Location_X'], '', None)
                location_Y_cluster = plot_data_range_plus(cluster['Location_Y'], '', None)
                plot_north_west_density_for_column(show_eildon, location_data,
                                                   cluster, new_col_name,
                                                   location_X_cluster,
                                                   location_Y_cluster, False)
cluster_north_west = add_density(north_west)
cluster_north_west['Density_trans'] = \
stats.boxcox(cluster_north_west['Density'], 0.5)
location_X_cluster_north_west = \
plot_data_range_plus(north_west['Location_X'], \
'Location_X - Northwest (minus Ireland)', None)
location_Y_cluster_north_west = \
plot_data_range_plus(north_west['Location_Y'], \
'Location_Y - Northwest (minus Ireland)', None)
show_eildon = True
Images will only be plotted below if the filtered data contains more sites, as a percentage of all sites in the area, than the show_percentage value set in User Settings above.
plot_north_west_dnesity_minus(show_eildon, north_west, \
location_X_cluster_north_west, \
location_Y_cluster_north_west)
print(str(len(north_west)) + " Records")
487 Records
management_encodeable_features = [
'Management_Condition_Extant',
'Management_Condition_Cropmark',
'Management_Condition_Destroyed',
'Management_Land_Use_Woodland',
'Management_Land_Use_Plantation',
'Management_Land_Use_Parkland',
'Management_Land_Use_Pasture',
'Management_Land_Use_Arable',
'Management_Land_Use_Scrub',
'Management_Land_Use_Outcrop',
'Management_Land_Use_Moorland',
'Management_Land_Use_Heath',
'Management_Land_Use_Urban',
'Management_Land_Use_Coastal',
'Management_Land_Use_Other']
plot_encodeable(hillforts_data, north_west, management_encodeable_features, show_percentage)
487
landscape_numeric_features = [
'Landscape_Altitude']
landscape_numeric_data = \
hillforts_data[landscape_numeric_features].copy()
landscape_numeric_data.head()
| | Landscape_Altitude |
|---|---|
| 0 | 276.0 |
| 1 | 150.0 |
| 2 | 225.0 |
| 3 | 150.0 |
| 4 | 338.0 |
location_landscape_numeric_data = \
pd.merge(location_numeric_data_short, landscape_numeric_data, left_index=True, \
right_index=True)
over_300 = \
location_landscape_numeric_data\
[(location_landscape_numeric_data['Landscape_Altitude'] >= 300) & \
(location_landscape_numeric_data['Landscape_Altitude'] < 400)].copy()
over_300['Landscape_Altitude'] = "Yes"
plot_altitude(hillforts_data, north_west, over_300, 'Landscape_Altitude', "300-400m", show_percentage)
7
over_200 = \
location_landscape_numeric_data\
[(location_landscape_numeric_data['Landscape_Altitude'] >= 200) & \
(location_landscape_numeric_data['Landscape_Altitude'] < 300)].copy()
over_200['Landscape_Altitude'] = "Yes"
plot_altitude(hillforts_data, north_west, over_200, 'Landscape_Altitude', "200-300m", show_percentage)
23
over_100 = \
location_landscape_numeric_data\
[(location_landscape_numeric_data['Landscape_Altitude'] >= 100) & \
(location_landscape_numeric_data['Landscape_Altitude'] < 200)].copy()
over_100['Landscape_Altitude'] = "Yes"
plot_altitude(hillforts_data, north_west, over_100, 'Landscape_Altitude', "100-200m", show_percentage)
100
over_000 = \
location_landscape_numeric_data\
[(location_landscape_numeric_data['Landscape_Altitude'] >= 0) & \
(location_landscape_numeric_data['Landscape_Altitude'] < 100)].copy()
over_000['Landscape_Altitude'] = "Yes"
plot_altitude(hillforts_data, north_west, over_000, 'Landscape_Altitude', "0-100m", show_percentage)
356
landscape_encodeable_features = [
'Landscape_Type_Contour',
'Landscape_Type_Partial',
'Landscape_Type_Promontory',
'Landscape_Type_Hillslope',
'Landscape_Type_Level',
'Landscape_Type_Marsh',
'Landscape_Type_Multiple',
'Landscape_Topography_Hilltop',
'Landscape_Topography_Coastal',
'Landscape_Topography_Inland',
'Landscape_Topography_Valley',
'Landscape_Topography_Knoll',
'Landscape_Topography_Ridge',
'Landscape_Topography_Scarp',
'Landscape_Topography_Hillslope',
'Landscape_Topography_Lowland',
'Landscape_Topography_Spur',
'Landscape_Aspect_N',
'Landscape_Aspect_NE',
'Landscape_Aspect_E',
'Landscape_Aspect_SE',
'Landscape_Aspect_S',
'Landscape_Aspect_SW',
'Landscape_Aspect_W',
'Landscape_Aspect_NW',
'Landscape_Aspect_Level']
plot_encodeable(hillforts_data, north_west, landscape_encodeable_features, show_percentage)
487
dating_encodeable_features = [
'Dating_Date_Pre_1200BC',
'Dating_Date_1200BC_800BC',
'Dating_Date_800BC_400BC',
'Dating_Date_400BC_AD50',
'Dating_Date_AD50_AD400',
'Dating_Date_AD400_AD800',
'Dating_Date_Post_AD800',
'Dating_Date_Unknown']
plot_encodeable(hillforts_data, north_west, dating_encodeable_features, show_percentage)
487
interior_encodeable_features = [
'Interior_Water_None',
'Interior_Water_Spring',
'Interior_Water_Stream',
'Interior_Water_Pool',
'Interior_Water_Flush',
'Interior_Water_Well',
'Interior_Water_Other',
'Interior_Surface_None',
'Interior_Surface_Round',
'Interior_Surface_Rectangular',
'Interior_Surface_Curvilinear',
'Interior_Surface_Roundhouse',
'Interior_Surface_Pit',
'Interior_Surface_Quarry',
'Interior_Surface_Other',
'Interior_Excavation_None',
'Interior_Excavation_Pit',
'Interior_Excavation_Posthole',
'Interior_Excavation_Roundhouse',
'Interior_Excavation_Rectangular',
'Interior_Excavation_Road',
'Interior_Excavation_Quarry',
'Interior_Excavation_Other',
'Interior_Excavation_Nothing',
'Interior_Geophysics_None',
'Interior_Geophysics_Pit',
'Interior_Geophysics_Roundhouse',
'Interior_Geophysics_Rectangular',
'Interior_Geophysics_Road',
'Interior_Geophysics_Quarry',
'Interior_Geophysics_Other',
'Interior_Geophysics_Nothing']
plot_encodeable(hillforts_data, north_west, interior_encodeable_features, show_percentage)
487
entrance_numeric_features = [
'Entrances_Breaks',
'Entrances_Original']
plot_numeric(hillforts_data, north_west, entrance_numeric_features, show_percentage)
487 [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 9.0]
[0.0, 1.0, 2.0, 3.0, 4.0]
entrance_encodeable_features = [
'Entrances_Guard_Chambers',
'Entrances_Chevaux']
plot_encodeable(hillforts_data, north_west, entrance_encodeable_features, show_percentage)
487
def plot_enclosing_area_1(hillforts_data, north_west, numeric_features, show_percentage):
    numeric_data = hillforts_data[numeric_features].copy()
    location_data = pd.merge(north_west, numeric_data,
                             left_index=True, right_index=True)
    print(len(location_data))
    ranges = [[0, 0.13], [0.13, 1], [1, 50]]
    if show_percentage < 10:
        show_percentage = 10
    for rng in ranges:
        # Filter data to sites whose enclosed area (Ha) falls within the range
        cluster = location_data[location_data['Enclosing_Area_1'].between(rng[0], rng[1])]
        new_col_name = 'Enclosing_Area_1' + "_" + str(rng[0]) + "_to_" + str(rng[1]) + "_Ha"
        cluster = cluster.rename(columns={'Enclosing_Area_1': new_col_name})
        show_eildon = False
        if (len(cluster) / len(location_data)) * 100 > show_percentage:
            # Refresh density for the filtered cluster
            cluster = renew_density(cluster)
            location_X_cluster = plot_data_range_plus(cluster['Location_X'], '', None)
            location_Y_cluster = plot_data_range_plus(cluster['Location_Y'], '', None)
            plot_north_west_density_for_column(show_eildon, location_data,
                                               cluster, new_col_name,
                                               location_X_cluster,
                                               location_Y_cluster, False)
Only Enclosing Area 1 will be used. See Hillforts Primer Part 5.
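One point worth noting about the range filter in plot_enclosing_area_1 above: pandas Series.between is inclusive at both ends by default, so a site whose enclosed area is exactly on a bin boundary (0.13 Ha or 1 Ha) matches two of the ranges. A minimal demonstration with illustrative values:

```python
import pandas as pd

# Toy enclosed areas (Ha); 0.13 and 1.0 sit exactly on bin boundaries
areas = pd.Series([0.05, 0.13, 0.5, 1.0, 2.0])

# Series.between(lo, hi) is inclusive at both ends by default
print(areas.between(0, 0.13).tolist())  # [True, True, False, False, False]
print(areas.between(0.13, 1).tolist())  # [False, True, True, True, False]
```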
enclosing_numeric_features_part1 = [
'Enclosing_Area_1'#,
#'Enclosing_Area_2',
#'Enclosing_Area_3',
#'Enclosing_Area_4',
#'Enclosing_Enclosed_Area',
#'Enclosing_Area',
]
plot_enclosing_area_1(hillforts_data, north_west, enclosing_numeric_features_part1, show_percentage)
487
print(enclosing_numeric_features_part1)
['Enclosing_Area_1']
enclosing_numeric_features_part2 = [
'Enclosing_Max_Ramparts',
'Enclosing_NE_Quadrant',
'Enclosing_SE_Quadrant',
'Enclosing_SW_Quadrant',
'Enclosing_NW_Quadrant',
'Enclosing_Ditches_Number']
plot_numeric(hillforts_data, north_west, enclosing_numeric_features_part2, show_percentage)
487 [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
[0.0, 1.0, 2.0, 3.0, 4.0]
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
[0.0, 1.0, 2.0, 3.0, 4.0]
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
enclosing_encodeable_features = [
'Enclosing_Multiperiod',
'Enclosing_Circuit',
'Enclosing_Current_Part_Uni',
'Enclosing_Current_Uni',
'Enclosing_Current_Part_Bi',
'Enclosing_Current_Bi',
'Enclosing_Current_Part_Multi',
'Enclosing_Current_Multi',
'Enclosing_Current_Unknown',
'Enclosing_Period_Part_Uni',
'Enclosing_Period_Uni',
'Enclosing_Period_Part_Bi',
'Enclosing_Period_Bi',
'Enclosing_Period_Part_Multi',
'Enclosing_Period_Multi',
'Enclosing_Surface_None',
'Enclosing_Surface_Bank',
'Enclosing_Surface_Wall',
'Enclosing_Surface_Rubble',
'Enclosing_Surface_Walk',
'Enclosing_Surface_Timber',
'Enclosing_Surface_Vitrification',
'Enclosing_Surface_Burning',
'Enclosing_Surface_Palisade',
'Enclosing_Surface_Counter_Scarp',
'Enclosing_Surface_Berm',
'Enclosing_Surface_Unfinished',
'Enclosing_Surface_Other',
'Enclosing_Excavation_Nothing',
'Enclosing_Excavation_Bank',
'Enclosing_Excavation_Wall',
'Enclosing_Excavation_Murus',
'Enclosing_Excavation_Timber_Framed',
'Enclosing_Excavation_Timber_Laced',
'Enclosing_Excavation_Vitrification',
'Enclosing_Excavation_Burning',
'Enclosing_Excavation_Palisade',
'Enclosing_Excavation_Counter_Scarp',
'Enclosing_Excavation_Berm',
'Enclosing_Excavation_Unfinished',
'Enclosing_Excavation_No_Known',
'Enclosing_Excavation_Other',
'Enclosing_Gang_Working',
'Enclosing_Ditches']
plot_encodeable(hillforts_data, north_west, enclosing_encodeable_features, show_percentage)
487
annex_encodeable_features = ['Annex']
plot_encodeable(hillforts_data, north_west, annex_encodeable_features, show_percentage)
487
If you do not wish to download the data using this document, all the processed data packages, notebooks and images are available here:
https://github.com/MikeDairsie/Hillforts-Primer.
#download(dating_data_list, 'Dating_package')
if save_images:
    path = os.path.join(IMAGES_PATH, f"fig_list_{part.lower()}.csv")
    fig_list.to_csv(path, index=False)